1. INTRODUCTION

Energy forecasting is vital for electricity traders, who buy and sell electricity, shift loads, and
schedule maintenance and unit commitment in order to balance their electricity purchases and to offer their
consumers optimally priced products. Demand for electricity is rising rapidly, and this growth has a serious
and harmful impact on the environment. Electricity use in the USA in 2018 was 16 times greater than
in 1950 [1]. According to Energy Information Administration data, electricity consumption could
be 79% greater by 2050 [2]. Accurate prediction of electrical energy consumption is therefore
needed to control excessive use and reduce energy waste, and so to minimize the harmful effect on
the environment. Saving electricity does not require large expenditure; it requires consuming
electricity in the most efficient way [3]. Most proposed prediction models use statistical methods
to forecast future data [4]. A suitable prediction model is identified from factors such as the
prediction period, the prediction interval, the duration of the time series and the characteristics of the
time series [5]. This paper presents a comparative study of two commonly used linear models for the
statistical prediction of energy consumption in Ohio/Kentucky. The models are a popular linear time series
model, the autoregressive integrated moving average (ARIMA) model [6], and the Holt-Winters model. These
models have exceptional adaptive capacity to deal with linearity in problem solving [7]. We applied the ARIMA
and Holt-Winters models for several reasons. ARIMA is a strong time series model in which the past values of
a series are used as independent variables. ARIMA forecasts are generally reliable and accurate [8]. The
Holt-Winters model is easier to apply, provides accurate forecasts and gives greater weight to recent
observations [9].

The comparison looks for the lowest root mean square error (RMSE), mean absolute
percentage error (MAPE) and mean absolute error (MAE), and the highest mean absolute percentage
accuracy (MAPA), for time series predictions. Our main contribution is to identify the more suitable and
accurate technique of two powerful time series models, ARIMA and Holt-Winters. Our work covers
daily, weekly and monthly electricity consumption from 2012 to 2018 for Duke Energy Ohio/Kentucky [10].
Section 2 presents existing work related to time series data and other models. Section 3
gives the theoretical background of our work and models. Section 4 explains the methodology. Section 5
analyzes the results and section 6 concludes our work.


2. RELATED WORKS 

Considerable research has been done on time series data. Ma et al. [11] applied support vector
machines (SVM) to forecast energy consumption in buildings in China. Their analysis showed that the SVM
method can forecast energy consumption with good accuracy. However, they did not use other models
or report any comparative results. Nie et al. [12] used a hybrid ARIMA and SVM technique for
load forecasting to achieve better accuracy, but the hybrid model was not clearly explained
mathematically. Fard and Zadeh [13] presented a hybrid technique for short-term load
forecasting based on the wavelet transform, an artificial neural network and ARIMA. Such a hybrid model
can increase complexity in many cases. In [14], several time series models were used to predict the
electricity consumption of University Tun Hussein Onn Malaysia (UTHM). The models used were simple
moving average (SMA), weighted moving average (WMA), Holt-Winters (HW), Holt linear trend (HL), simple
exponential smoothing (SES) and centered moving average (CMA). The analysis found that HW
gives the smallest MAE and MAPE.

Pan [15] showed for airline passenger data that the Holt-Winters technique predicts more accurately
than the ARIMA model. That study could have been extended by resampling the data to experiment with
different periods. Chujai et al. [16] identified a suitable model and period (daily, weekly, monthly or
quarterly) for forecasting household energy consumption. They used ARMA and ARIMA models for data
analysis, selecting the most appropriate model by the lowest AIC value and the most appropriate period by the
lowest RMSE. They did not use other metrics such as MAPE and MAE to analyze the results.

Bouktif et al. [17] used a long short-term memory (LSTM) based model to forecast electricity consumption. By
selecting the best baseline, choosing appropriate features and applying a genetic algorithm, they obtained an
optimal prediction model. In [18], support vector regression was applied to hourly energy consumption data
from different buildings. That work did not use different models
or different data periods for a fuller analysis. Vinagre et al. [19] proposed an artificial neural network
based method for forecasting the energy consumption of an office building using data from October 2014 to
April 2015. Their average MAPE with the best network was 13.6%. This error is moderate and could
be improved using other models or hybrid models.

Yoo and Myriam [20] proposed a neural network based model to predict residential electricity
consumption in Seoul using historical data from January 1996 to July 2016. They identified some interesting
characteristics that have a direct impact on the data. The MAPE over the total dataset was 4.85%. Bilal Şişman [21]
compared ARIMA and the Grey Model for forecasting electricity consumption in Turkey
and found that the MAPEs of ARIMA and the Grey Model were 4.9% and 5.6% respectively. Both
performed considerably better than the MAED model, which has a MAPE of 14.8%. This study could
be applied to other forecasting fields, along with other models, to increase its accuracy and efficiency.

In [22], four different models based on a multi-objective genetic algorithm were applied to predict
power consumption using data from 1 September 2010 to 29 February 2012 for a building in Spain. These
models were compared with an existing perceptron model [23] and a naive autoregressive baseline (NAB)
model [24]. The NAB model showed the worst performance in terms of MAPE. A deep learning auto-encoder
model was proposed in [25] for predicting electric energy consumption. The authors used five years of
household power consumption data and obtained a mean squared error of 0.384, but they did not
report any percentage error. This work could also be extended using data from several households or from
a whole area to make it more applicable. Kim and Cho [26] proposed a CNN-LSTM neural network that can
predict housing energy consumption effectively and achieved an RMSE of 61.14%.

In [27], the researchers proposed LSTM and GRU networks to predict traffic flow. The MSE of the
two RNNs was shown to be smaller than that of ARIMA, and GRU performed better than LSTM in 84% of the
time series. Tso and Yau [28] presented three techniques (regression analysis, neural network and decision
tree) for the prediction of energy consumption. They compared the three models based on their RASE and found
that the RASE values of the three models are quite similar. Other performance metrics could be used besides
RASE for a better comparison among the models.


3. THEORETICAL FRAMEWORK 

Time series analysis works on historical data collected over a continuous period. The observations can
have different periods (hourly, daily, weekly, monthly, quarterly or yearly) depending on the usage and scope
of the users. We use the ARIMA and Holt-Winters models to analyze the time series of energy
consumption for Duke Energy Ohio/Kentucky [10]. Figure 1 shows the general flow diagram of our work.
The hourly observations of energy consumption are resampled to daily, weekly and monthly
data, and the ARIMA and Holt-Winters models are then applied to each of these datasets separately.
Figure 1. Workflow for analysing the time series


3.1. ARIMA model 

ARIMA is a technique that analyzes autocorrelation in a time series by modeling it directly. It
combines an autoregressive (AR) process, a moving average (MA) process and an integrated (I) component [29].
An integrated series has to be differenced to become stationary. The AR terms are lags of the stationary
series, and the MA terms are lags of the forecast errors. The model uses the Box and Jenkins
approach, which has been applied extensively in time series forecasting studies [30]. The ARIMA model
has three parameters (p, d, q) [31]. Here, p is the order of the autoregressive component, d is the number
of differences required to make the series a stationary ARMA(p, q) process, and q is the order of the moving
average component [32].
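
For reference, the role of (p, d, q) can be written compactly in backshift notation. The symbols B (backshift operator, B y_t = y_{t-1}), \phi_i, \theta_j, c and \varepsilon_t are standard textbook notation introduced here as assumptions; they do not appear in the original text.

```latex
% ARIMA(p, d, q): AR polynomial applied to the d-times differenced series
% equals a constant plus the MA polynomial applied to white noise.
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

With d = 0 this reduces to a stationary ARMA(p, q) model, which is the case for the orders of the form (p, 0, q).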


3.2. Holt Winters model 
The Holt-Winters model is a very popular forecasting technique for time series data.
It works with three features of the time series: a regular (mean)
value, a slope or trend over time, and a cyclical pattern that repeats seasonally. The model combines the
effects of these three aspects to predict a present or future value. It is also called triple exponential
smoothing, because the three features (level, trend and seasonality) are each represented by a type of
exponential smoothing. The model needs three parameters (α, β, γ),
one for each smoothing, together with the duration of a season and the number of periods in a season.
- α (coefficient of level smoothing, or base value): The parameter α determines the weight of the past
values [33].
- β (coefficient of trend smoothing, or trend value): The parameter β determines the weight of the recent
trend values [33].
- γ (coefficient of seasonality smoothing, or seasonal component): The parameter γ is the coefficient
for the seasonal smoothing [34].
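
To make the three smoothings concrete, the following is a minimal hand-written sketch of additive Holt-Winters forecasting. It is not the library routine used later in the paper. The function name `holt_winters_additive`, the choice of the additive form, the initial-state heuristics and the β value in the usage lines are all assumptions made for illustration.

```python
import numpy as np


def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive triple exponential smoothing.

    y: observed values, m: number of periods in a season.
    Initial level, trend and seasonal states use simple heuristics
    (an assumption of this sketch, not taken from the paper).
    """
    y = np.asarray(y, dtype=float)
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    level = y[:m].mean()                              # mean of the first season
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m    # average per-step change
    season = list(y[:m] - level)                      # latest seasonal index per slot

    for t in range(m, len(y)):
        s_prev = season[t % m]
        prev_level, prev_trend = level, trend
        # Level smoothing (alpha): deseasonalised observation vs. previous projection.
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        # Trend smoothing (beta): latest change in level vs. previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Seasonal smoothing (gamma): detrended observation vs. previous seasonal index.
        season[t % m] = gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s_prev

    n = len(y)
    return np.array([level + h * trend + season[(n + h - 1) % m]
                     for h in range(1, horizon + 1)])


# Usage on a synthetic, perfectly periodic series (not the paper's data).
pattern = [10.0, 20.0, 30.0, 40.0]
forecast = holt_winters_additive(pattern * 6, m=4, alpha=0.85, beta=0.1,
                                 gamma=0.15, horizon=8)
print(np.round(forecast, 3))
```

On a series with no trend and an exactly repeating season, the one-step errors are zero, so the forecast simply continues the seasonal pattern. This makes the sketch easy to sanity-check before applying it to real data.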


3.3. Performance metrics 
- Mean absolute error (MAE): The sum of the absolute errors of n observations divided by n, calculated by (1).

  $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$ [35] (1)

- Root mean squared error (RMSE): The square root of the mean squared differences between forecasted and
test (actual) observations is calculated by RMSE as shown in (2).

  $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$ [36] (2)

  Here n = number of observations, $y_i$ = true value, $\hat{y}_i$ = predicted value.
- Mean absolute percentage error (MAPE): It can be measured by (3).

  $\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|A_i - F_i|}{A_i} \times 100$ [37] (3)

  Here N = number of total observations, A = true value, F = predicted value.
- Mean absolute percentage accuracy (MAPA): Mean absolute percentage accuracy can be measured simply by
subtracting the MAPE value from 100. This gives us the accuracy in percentage.
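
The four metrics above translate directly into code. The following is a small sketch in plain Python; the function names and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, as in (1)."""
    return sum(abs(a - f) for a, f in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, as in (2)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent, as in (3).

    Uses |A| in the denominator, which equals A for positive loads;
    undefined if any actual value is zero.
    """
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, predicted)) / len(actual)


def mapa(actual, predicted):
    """Mean absolute percentage accuracy: 100 minus MAPE."""
    return 100.0 - mape(actual, predicted)


# Illustrative values only (not from the dataset).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(mae(actual, predicted), rmse(actual, predicted),
      mape(actual, predicted), mapa(actual, predicted))
```

MAE and RMSE are in the units of the data (megawatts here), while MAPE and MAPA are percentages and so can be compared across datasets of different scale.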


4. METHODOLOGY 

Our main goal is to apply the ARIMA and Holt-Winters models to our data set (Duke Energy
Ohio/Kentucky) to forecast energy consumption for daily, weekly and monthly data, and to
compare the results to find the more suitable model for forecasting. We worked with Python 3 in a Jupyter
notebook. The workflow of our analysis is given in Figure 1.


4.1. Dataset 

The dataset we used is hourly energy consumption data taken from PJM's website [10]. PJM
Interconnection LLC is a regional transmission organization (RTO) in the USA. Figure 2 (a) shows a sample
of the dataset and Figure 2 (b) shows its specifications. The dataset records the
hourly energy consumption (in megawatts, MW) of Ohio/Kentucky [10].


4.2. Loading and resampling the data set 
The data set contains 57740 observations of hourly energy consumption from 12/31/2012 1:00 AM
to 1/2/2018 12:00:00 AM. First, the dataset is loaded from CSV format using the pandas library. The original
dataset is shown in Figure 3, where the hourly observations in megawatts are plotted. We then resample
the dataset into daily, weekly and monthly observations, which are plotted in
Figure 4, Figure 5 and Figure 6 respectively.
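
The loading and resampling step can be sketched with pandas as follows. Several details here are assumptions, because the text does not state them: the column names `Datetime` and `DEOK_MW`, the helper names, and the use of the mean (rather than the sum) as the aggregation. The demonstration runs on a synthetic series, not the PJM data.

```python
import numpy as np
import pandas as pd


def load_hourly(path, time_col="Datetime", value_col="DEOK_MW"):
    """Read the hourly CSV into a time-indexed Series (column names are assumed)."""
    frame = pd.read_csv(path, parse_dates=[time_col], index_col=time_col)
    return frame[value_col].sort_index()


def resample_consumption(hourly):
    """Aggregate an hourly Series to daily, weekly and monthly means."""
    hourly = hourly.sort_index()
    return {
        "daily": hourly.resample("D").mean(),
        "weekly": hourly.resample("W").mean(),
        "monthly": hourly.resample("MS").mean(),
    }


# Synthetic stand-in: 60 days of hourly values, constant within each day.
index = pd.date_range("2012-01-01", periods=24 * 60, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.repeat(np.arange(60, dtype=float), 24), index=index)
resampled = resample_consumption(hourly)
print({name: len(series) for name, series in resampled.items()})
```

Sorting the index before resampling matters when the rows of the source file are not in chronological order.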
We split our data set into a training set and a test set. We then fit the models on the training set and
evaluate their forecasts against the test set. The training and test sets are split as follows:
- Daily observations: Of the 2407 observations in total, the first 1800 are used as the training set and the
rest as the test set, for both the ARIMA and Holt-Winters models.
- Weekly observations: Of the 344 observations in total, the first 260 are used as the training set for the
Holt-Winters model and the first 300 as the training set for the ARIMA model; the rest are used as the test set.
- Monthly observations: Of the 80 observations in total, the first 60 are used as the training set and the
rest as the test set, for both the ARIMA and Holt-Winters models.
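
The chronological split described in the list above can be sketched as below. The helper name `split_train_test` and the dictionary layout are illustrative assumptions; the training sizes and totals are those stated in the text.

```python
def split_train_test(series, n_train):
    """Split a time-ordered sequence: first n_train points train, the rest test."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]


# (training size, total observations) as reported in the text.
SPLITS = {
    "daily (both models)": (1800, 2407),
    "weekly (Holt-Winters)": (260, 344),
    "weekly (ARIMA)": (300, 344),
    "monthly (both models)": (60, 80),
}

test_sizes = {}
for name, (n_train, total) in SPLITS.items():
    train, test = split_train_test(list(range(total)), n_train)
    test_sizes[name] = len(test)
    print(f"{name}: train={len(train)}, test={len(test)}")
```

These sizes leave 607 daily, 20 monthly, and 84 (Holt-Winters) or 44 (ARIMA) weekly test observations, so the two weekly models are evaluated over test windows of different lengths.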


4.3. Building models for prediction 
- Autoregressive integrated moving average (ARIMA) model

The first step in building an ARIMA model is to find the most appropriate order
(p, d, q). The function auto_arima() is used, which returns the best ARIMA model based on the Akaike
Information Criterion (AIC) value. In our case the most suitable orders for the daily, weekly and monthly
data are (3, 0, 3), (1, 0, 1) and (2, 1) respectively.
- Holt-Winters model

To build the model, the function ExponentialSmoothing() is used. An important part of building the
model is to determine suitable parameters (α, β, γ). The most suitable values of α and γ
determined for our dataset are:
α = 0.85 and γ = 0.15 for daily data
α = 0.23 and γ = 0.0 for weekly data
α = 0.56 and γ = 0.0 for monthly data
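
For the order selection in the ARIMA step above, the AIC is commonly defined as below. The symbols k (number of estimated parameters) and \hat{L} (maximized likelihood of the fitted model) are standard notation introduced here as assumptions; the original text does not give a formula.

```latex
% Akaike Information Criterion: penalised goodness of fit.
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Among candidate orders (p, d, q), the one with the lowest AIC is preferred, which trades goodness of fit against the number of parameters.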
Figure 3. Original dataset (hourly observations)
Figure 4. Daily dataset after resampling
Figure 5. Weekly dataset after resampling


4.4. Predicting values

The results obtained by applying the models to the test set are illustrated in Figures 7-12. Each
figure compares the original and forecasted values for one of the two models (ARIMA or Holt-Winters).
The x axis indicates the time period (day, week or month) and the y axis indicates the energy
consumption in megawatts (MW).


5. RESULT ANALYSIS AND DISCUSSION
5.1. Analysis from the output graph 

Figures 7 and 8 show the output for daily observations of energy consumption using the ARIMA and
Holt-Winters models respectively. The ARIMA model does not follow the original values as closely as the
Holt-Winters model does. Figures 9 and 10 show the output for weekly observations of energy consumption
using the ARIMA and Holt-Winters models respectively. Here the Holt-Winters model follows the original
values, but the ARIMA model gives an almost linear output. Figures 11 and 12 show the output for monthly
observations of energy consumption using the ARIMA and Holt-Winters models respectively. As with the previous
two datasets, the Holt-Winters model follows the original values more closely than the ARIMA model. This
suggests that the Holt-Winters model outperforms the ARIMA model in each case.


5.2. Analysis using evaluation metrics 

After obtaining the results of the two models for daily, weekly and monthly data, we compare them
using the MAE, RMSE, MAPE and MAPA values shown in Table 1. Table 1 shows that
for daily, weekly and monthly data alike, the error values (MAE, RMSE and MAPE) are lower
for the Holt-Winters model, and the accuracy (MAPA) is correspondingly higher. For the ARIMA
model the accuracy on daily, weekly and monthly data is 88.51%, 89.814% and 90.85% respectively. For the
Holt-Winters model it is 89.74%, 94.589% and 95.64% respectively.
We also observe that the longer the period, the more accurate the result. The best accuracy, 95.64%, is
obtained for monthly data using the Holt-Winters model.


TELKOMNIKA Telecommun Comput El Control, Vol. 19, No. 3, June 2021: 991 - 1000


5.3. Comparison with other existing works 

Finally, we compare our two models with a few other models on the basis of the mean
absolute percentage error (MAPE). Many other analyses of prediction models for
electric energy consumption exist. Table 2 summarizes some of them alongside our analysis. Table 2
shows that the Holt-Winters model (for monthly observations) has the lowest MAPE of all the models
compared. Our MAPE value is 4.36% for monthly observations using the Holt-Winters model. Thus, this
model can be used for prediction of energy consumption with a view to planning electricity
supply properly and minimizing energy waste for sustainable development. The model can also be used for
other time series prediction tasks in support of management and decision making.


Table 1. Comparison of the two models based on different evaluation metrics

Period               Evaluation metric   ARIMA     Holt-Winters
Daily observation    MAE                 343.41    313.81
                     RMSE                426.09    407.63
                     MAPE                11.49%    10.26%
                     MAPA                88.51%    89.74%
Weekly observation   MAE                 303.48    171.58
                     RMSE                357.18    222.02
                     MAPE                10.186%   5.411%
                     MAPA                89.814%   94.589%
Monthly observation  MAE                 316.21    134.44
                     RMSE                347.18    184.12
                     MAPE                9.15%     4.36%
                     MAPA                90.85%    95.64%


Table 2. Comparison of our obtained MAPE with other existing analyses

Reference                         Data set                                        Method used                  MAPE (%)
Eugénia Vinagre, Luis Gomes,      Oct 2014-Apr 2015 (energy consumption for       ANN                          13.6
Zita [19]                         an office building)
Sang Guun Yoo, Myriam             Jan 1996-Jul 2016 (residential electricity      Neural network               4.85
Hernández-Álvarez [20]            consumption of Seoul)
Bilal Şişman [21]                 1970-2013 (Turkish Statistical Institute and    ARIMA                        4.9
                                  electricity distribution and consumption        Grey Model                   5.6
                                  statistics of Turkey)
Khosravani, Castilla, Berenguel,  1 Sep 2010-29 Feb 2012 (solar energy research   Multi-objective genetic      5.04 (average of
Ruano, Ferreira [22]              centre bioclimatic building, University of      algorithm (MOGA)             the best models)
                                  Almeria, Spain)
Models we used to analyze         12 Dec 2012-1 Feb 2018 (energy consumption of   ARIMA (monthly data)         9.15
                                  Ohio/Kentucky from PJM's website [10])          Holt-Winters (monthly data)  4.36


6. CONCLUSION 

Appropriate management of energy is an important concern today, and it requires an accurate
model for forecasting energy consumption. We applied the ARIMA and
Holt-Winters models to the energy consumption data of Ohio/Kentucky from PJM's website [10] and made a
comparative analysis of the results. The main aim of our study was to find the more suitable of the two
models for prediction on daily, weekly and monthly data. We determined the better suited model
based on the minimum values of MAE, RMSE and MAPE. Our analysis showed that the Holt-Winters
model provided higher accuracy on the data sets in each case (daily, weekly, monthly).

We then compared a few other existing models with our two models, which also showed that the
Holt-Winters model we used has greater accuracy for the monthly observations. We therefore conclude that for
this kind of long-term forecasting, the Holt-Winters model can work efficiently for proper energy
management. Further studies can be done on similar datasets considering other parameters and
environmental factors that have a greater impact on the data. Hybrid models, or models such as ANN and
genetic algorithms, could also be used to obtain better results in short-term load forecasting.